Speaker Identity Indexing in Audio-visual Documents
Authors
Abstract
The identity of the people appearing in audiovisual documents is important semantic information for content-based indexing and retrieval. Speaker identity detection can draw on data from several modalities (text, image and audio). In this article, we propose an approach for speaker identity indexing in broadcast news using audio content. After a speaker segmentation phase, an identity is assigned to speech segments by applying linguistic patterns to their transcriptions, which are obtained by speech recognition. Three types of patterns are used to predict the speaker of the previous, current and next speech segments. Predictions are then propagated to other segments by similarity at the acoustic level. Evaluations were conducted on part of the TREC 2003 corpus: a speaker identity could be assigned to 53% of the annotated corpus with 82% precision.
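The two stages described in the abstract (pattern-based prediction, then propagation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three regular expressions, the person names, and the use of precomputed cluster labels as a stand-in for acoustic similarity are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# Assumption: a person name is two capitalized words.
NAME = r"([A-Z][a-z]+ [A-Z][a-z]+)"

# Hypothetical linguistic patterns. The offset says which segment the
# matched name refers to: -1 previous, 0 current, +1 next.
PATTERNS = [
    (re.compile(r"\b[Tt]hank you,? " + NAME), -1),
    (re.compile(r"\bI'm " + NAME), 0),
    (re.compile(NAME + r" reports\b"), +1),
]

def predict(transcripts):
    """Return {segment_index: name} for segments named by a pattern."""
    predictions = {}
    for i, text in enumerate(transcripts):
        for pattern, offset in PATTERNS:
            match = pattern.search(text)
            target = i + offset
            if match and 0 <= target < len(transcripts):
                # Keep the first prediction made for a segment.
                predictions.setdefault(target, match.group(1))
    return predictions

def propagate(predictions, clusters):
    """Give every segment the majority name of its acoustic cluster.

    `clusters[i]` is an assumed, precomputed cluster label for segment i;
    segments in a cluster with no prediction stay unnamed (None).
    """
    votes = defaultdict(Counter)
    for idx, name in predictions.items():
        votes[clusters[idx]][name] += 1
    return [
        votes[c].most_common(1)[0][0] if votes[c] else None
        for c in clusters
    ]

# Made-up transcripts of four consecutive speech segments.
transcripts = [
    "Good evening, I'm Alice Martin. Bob Keller reports from the capital.",
    "The senate voted today on the budget.",
    "Thank you, Bob Keller. In other news tonight.",
    "More on the vote tomorrow.",
]
clusters = [0, 1, 0, 1]  # assumed output of acoustic clustering

named = predict(transcripts)
labels = propagate(named, clusters)
```

Here the patterns name only segments 0 and 1 directly; segments 2 and 3 receive their names through the cluster labels, which is the role the abstract gives to acoustic similarity.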
Related resources
Dependent Video Indexing Based on Audio-Visual Interaction
A content-based video indexing method is presented in this paper that aims at temporally indexing a video sequence according to the actual speaker. This is achieved by the integration of audio and visual information. Audio analysis leads to the extraction of a speaker identity label versus time diagram. Visual analysis includes scene cut detection, face shot determination, mouth region extracti...
Speaker utterances tying among speaker segmented audio documents using hierarchical classification: towards speaker indexing of audio databases
Speaker indexing of an audio database consists in organizing the audio data according to the speakers present in the database. It is composed of three steps: (1) segmentation of each audio document by speaker; (2) speaker tying among the various segmented portions of the audio documents; and (3) generation of a speaker-based index. This paper focuses on the second step, the speaker tying task, ...
On-line Speaker Indexing Project Role in Support of Imsc Strategic Plan Discussion of Methodology Used
Unsupervised speaker indexing sequentially detects points where a speaker identity changes in a multi-speaker audio stream, and categorizes each speaker segment, without any prior knowledge about the speakers. This project addresses two challenges: The first relates to sequential speaker change detection. The second relates to speaker modeling in light of the fact that the number/identity of th...
Audio Indexing: What Has Been Accomplished and the Road Ahead
This paper presents an overview of audio indexing, which has emerged very recently as a research topic with the development of the Internet. Much data, including audio data, is currently not indexed by web search engines, and audio indexing consists in finding good descriptors of audio documents that can be used as indexes for archiving and search. We discuss speech/music segmentation, langua...
Speaker Identification for Audio Indexing Applications
Sofia Tsekeridou and Ioannis Pitas, Dept. of Informatics, Aristotle Univ. of Thessaloniki, Box 451, Thessaloniki 54006, GREECE. Tel: +30 31 996304, Fax: +30 31 996304, e-mail: [email protected] ABSTRACT: A method for identifying different speakers from an audio source of continuous speech is described in this paper aiming at extracting the speaker sequence, timing information and speaker identi...